Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix professors for sections not being scraped #244

Merged
merged 17 commits into from
Jan 21, 2025
Merged

Conversation

mehallhm
Copy link
Contributor

@mehallhm mehallhm commented Nov 19, 2024

Purpose

Scrapes the professors for a section. Because the Banner API removed this field from the search results when not authenticated through Northeastern SSO, this has to be acquired via a separate request to the section professor details page.

Tickets

Feature List

  • When scraping sections, fires an additional request to the section's professor details page to get the faculty information.

Notes

This is kinda thrown together and the implementation very much can be improved. For now, it gets the job done but the scraper needs some love... I think there is code in there that is tested but never used?

Reviewers

Primary:
@ananyaspatil

Secondary:
@nickpfeiffer05 @Anzhuo-W @cherman23

UPDATE: As troubleshooting the OpenSSL error, the Yarn Berry PR #241 was merged into this PR. For documentation, the PR comment has been reproduced below.

Purpose

Migrates off of the end-of-life Yarn Classic to the actively maintained Yarn Berry version. In general, it is a drop in replacement, using the Node experimental Corepack as the preferred installation / management tool (the irony of managing a package manager).

Tickets

Closes #225, Closes #232

Feature List

  • Switches from Yarn Classic to Yarn Berry in node_modules compatibility mode.
  • Updates the Dockerfile to enable Corepack and use the same version of Yarn as dev, ensuring consistency.
  • Prepped Dockerfile for a better build system in the future. Yarn Berry forces us to stop abusing Yarn like we were before so some of this needed to happen now anyway. This will be something that will show its value soon...
  • Updates the GH actions to use Corepack to install deps (and generally get them to newer versions).

Notes

I will open a PR on searchneu soon to ensure that there is a version locked in because Corepack likes to add itself otherwise.

@coveralls
Copy link
Collaborator

coveralls commented Nov 19, 2024

Coverage Status

coverage: 79.741% (-0.4%) from 80.173%
when pulling 80c6940 on scraper-prof-update
into a5d14a0 on master.

Copy link
Member

@Lucas-Dunker Lucas-Dunker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, merge and rescrape asap! Just remove that console.log lol

@mehallhm mehallhm added the bug label Jan 16, 2025
@mehallhm mehallhm self-assigned this Jan 16, 2025
@mehallhm mehallhm mentioned this pull request Jan 21, 2025
@mehallhm mehallhm merged commit addbf55 into master Jan 21, 2025
10 checks passed
@mehallhm mehallhm deleted the scraper-prof-update branch January 21, 2025 20:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Teaching faculty for classes not being scrapped Upgrade all GitHub actions Upgrade to Yarn Berry
3 participants